78 research outputs found

    Static Body Expression Recognition with Openpose

    This thesis presents a reliable machine model that recognizes the action of "look down at phone" and distinguishes it from similar actions in a given continuous video. It first reproduces the facial recognition research of dimpler. It then introduces body action recognition and explains key elements used in the research, such as landmarks and Openpose. Next, it presents the action of "look down at phone" as the research focus and briefly reviews related work. The methods chapter describes how facial expression recognition is performed and then moves on to body expression detection techniques, explaining the process first for image inputs and then for video samples. At the end of the methods chapter, the thesis demonstrates how the machine model processes a 26-second video with complex actions and gives a reliable and correct estimate of the actions the video contains. The thesis then presents the results of this research and compares the model's estimates with the true labels of the given samples. Finally, it summarizes the effects and benefits of this machine model and suggests future work.

    TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer

    In this work, we address the problem of musical timbre transfer, where the goal is to manipulate the timbre of a sound sample from one instrument to match another instrument while preserving other musical content, such as pitch, rhythm, and loudness. In principle, one could apply image-based style transfer techniques to a time-frequency representation of an audio signal, but this depends on having a representation that allows independent manipulation of timbre as well as high-quality waveform generation. We introduce TimbreTron, a method for musical timbre transfer which applies "image" domain style transfer to a time-frequency representation of the audio signal, and then produces a high-quality waveform using a conditional WaveNet synthesizer. We show that the Constant Q Transform (CQT) representation is particularly well-suited to convolutional architectures due to its approximate pitch equivariance. Based on human perceptual evaluations, we confirmed that TimbreTron recognizably transferred the timbre while otherwise preserving the musical content, for both monophonic and polyphonic samples. Comment: 17 pages, published as a conference paper at ICLR 201

    Cash Transfers and Health

    Financial resources are known to affect health outcomes. Many types of social policies and programs, including social assistance and social insurance, have been implemented around the world to increase financial resources. As an overall term, we refer to these as cash transfers. In this article, we discuss whether, how, for whom, and to what extent purposeful cash transfers may improve health, both theoretically and empirically. The overall finding is that cash transfers are very positive, but as usual, there are many complexities and variations. Continuing research and policy innovation (for example, universal basic income and universal Child Development Accounts) are likely to be productive. Forthcoming in the Annual Review of Public Health. This paper is posted with permission from the Annual Review of Public Health, Volume 42. Copyright 2021 Annual Reviews, http://www.annualreviews.or

    Efficient Parametric Approximations of Neural Network Function Space Distance

    It is often useful to compactly summarize important properties of model parameters and training data so that they can be used later without storing and/or iterating over the entire dataset. As a specific case, we consider estimating the Function Space Distance (FSD) over a training set, i.e. the average discrepancy between the outputs of two neural networks. We propose a Linearized Activation Function TRick (LAFTR) and derive an efficient approximation to FSD for ReLU neural networks. The key idea is to approximate the architecture as a linear network with stochastic gating. Despite requiring only one parameter per unit of the network, our approach outcompetes other parametric approximations with larger memory requirements. Applied to continual learning, our parametric approximation is competitive with state-of-the-art nonparametric approximations, which require storing many training examples. Furthermore, we show its efficacy in estimating influence functions accurately and detecting mislabeled examples without expensive iterations over the entire dataset. Comment: 18 pages, 5 figures, ICML 202
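
    For orientation, the quantity this abstract calls FSD can be sketched directly as an average over stored inputs. This is only an illustration of the exact (non-approximated) quantity, not the LAFTR method itself. The squared Euclidean discrepancy, the helper names `function_space_distance` and `relu_net`, and the toy one-parameter "networks" are all assumptions made for this sketch; the abstract says only "average discrepancy".

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Network = Callable[[Vector], List[float]]


def function_space_distance(f: Network, g: Network, inputs: Sequence[Vector]) -> float:
    """Average discrepancy between the outputs of f and g over `inputs`.

    Assumption: the discrepancy is the squared Euclidean distance between
    the two output vectors. Other choices are possible.
    """
    if not inputs:
        raise ValueError("inputs must be non-empty")
    total = 0.0
    for x in inputs:
        out_f, out_g = f(x), g(x)
        total += sum((a - b) ** 2 for a, b in zip(out_f, out_g))
    return total / len(inputs)


def relu_net(weight: float, bias: float) -> Network:
    """Hypothetical toy 'network': an elementwise affine map followed by ReLU."""
    def net(x: Vector) -> List[float]:
        return [max(0.0, weight * xi + bias) for xi in x]
    return net


if __name__ == "__main__":
    f = relu_net(1.0, 0.0)
    g = relu_net(1.0, 1.0)
    training_inputs = [[1.0], [2.0]]
    print(function_space_distance(f, g, training_inputs))
```

    Computing the distance this way requires iterating over every stored training input, which is the cost the abstract's parametric approximation is meant to avoid.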

    Cash Transfers and Health

    Financial resources are known to affect health outcomes. Many types of social policies and programs, including social assistance and social insurance, have been implemented around the world to increase financial resources. We refer to these as cash transfers. In this article, we discuss theory and evidence on whether, how, for whom, and to what extent purposeful cash transfers improve health. Evidence suggests that cash transfers produce positive health effects, but there are many complexities and variations in the outcomes. Continuing research and policy innovation (for example, universal basic income and universal Child Development Accounts) are likely to be productive.

    Safety Guaranteed Manipulation Based on Reinforcement Learning Planner and Model Predictive Control Actor

    Deep reinforcement learning (RL) has raised high expectations for tackling challenging manipulation tasks in an autonomous and self-directed fashion. Despite the significant strides made in the development of reinforcement learning, the practical deployment of this paradigm is hindered by at least two barriers, namely, the engineering of a reward function and ensuring the safety guarantee of learning-based controllers. In this paper, we address these limitations by proposing a framework that merges a reinforcement learning planner that is trained using sparse rewards with a model predictive controller (MPC) actor, thereby offering a safe policy. On the one hand, the RL planner learns from sparse rewards by selecting intermediate goals that are easy to achieve in the short term and promising to lead to target goals in the long term. On the other hand, the MPC actor takes the suggested intermediate goals from the RL planner as the input and predicts how the robot's action will enable it to reach that goal while avoiding any obstacles over a short period of time. We evaluated our method on four challenging manipulation tasks with dynamic obstacles and the results demonstrate that, by leveraging the complementary strengths of these two components, the agent can solve manipulation tasks in complex, dynamic environments safely with a 100% success rate. Videos are available at https://videoviewsite.wixsite.com/mpc-hgg